Abstract

We study Eigen's quasispecies model in the asymptotic regime where
the length of the genotypes goes to $\infty$ and the mutation probability
goes to 0. We give several explicit formulas for the stationary solutions
of the limiting system of differential equations.


1 Introduction 


Manfred Eigen introduced the quasispecies model in his celebrated 1971
article about the first stages of life on Earth [7]. As a part of his article,
Eigen constructed a model in order to explain the evolution of a population
of macromolecules subject to selection and mutation forces. Given a set
of genotypes $\mathcal{G}$, a fitness function $A : \mathcal{G} \to [0,\infty[$ and a mutation kernel
$M : \mathcal{G} \times \mathcal{G} \to [0,1]$, Eigen's model states that the concentration $x_v$ of the
genotype $v \in \mathcal{G}$ evolves according to the differential equation

$$x_v'(t) = \sum_{u \in \mathcal{G}} x_u(t) A(u) M(u,v) - x_v(t) \sum_{u \in \mathcal{G}} x_u(t) A(u).$$

The first term accounts for the production of individuals having genotype $v$,
production due to erroneous replication of other genotypes as well as faithful
replication of itself. The negative term accounts for the loss of individuals
having genotype $v$, and keeps the total concentration of individuals constant.
Instead of studying the model in all its generality, Eigen considered the
following simplified setting:

Genotypes. They are sequences of fixed length $\ell$ over a finite alphabet
$\mathcal{A}$ of cardinality $\kappa$. The set of genotypes is then $\mathcal{A}^\ell$.

Selection. It is given by the sharp peak landscape, i.e., there is a genotype
$w^* \in \mathcal{A}^\ell$, called the master sequence, having fitness $\sigma > 1$, while all the other
genotypes have fitness 1. The fitness function $A : \mathcal{A}^\ell \to [0,\infty[$ is thus given
by

$$\forall u \in \mathcal{A}^\ell \qquad A(u) = \begin{cases} \sigma & \text{if } u = w^*, \\ 1 & \text{if } u \neq w^*. \end{cases}$$


Mutations. They happen during reproduction, independently at random
over each site of the sequence, with probability $q \in [0,1]$. When a mutation
happens, the letter is replaced by another one, chosen uniformly at random
over the $\kappa - 1$ other letters of the alphabet. The mutation kernel is thus
given by

$$\forall u, v \in \mathcal{A}^\ell \qquad M(u,v) = (1-q)^{\ell - d(u,v)} \Big( \frac{q}{\kappa - 1} \Big)^{d(u,v)},$$

where $d$ is the Hamming distance, i.e., the number of different digits between
two sequences:

$$\forall u, v \in \mathcal{A}^\ell \qquad d(u,v) = \mathrm{card}\, \{\, l \in \{1,\dots,\ell\} : u(l) \neq v(l) \,\}.$$
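As a small illustration of the mutation kernel just described, the following sketch evaluates $M(u,v)$ on a toy genotype space and checks that every row sums to 1. The alphabet, the sequence length and the value of $q$ are arbitrary choices made for the example; they do not come from the article.

```python
from itertools import product

def hamming(u, v):
    """Number of sites at which the sequences u and v differ."""
    return sum(1 for a, b in zip(u, v) if a != b)

def mutation_kernel(u, v, q, kappa):
    """M(u, v) = (1 - q)^(l - d) * (q / (kappa - 1))^d with d = d(u, v)."""
    length, d = len(u), hamming(u, v)
    return (1 - q) ** (length - d) * (q / (kappa - 1)) ** d

# Toy parameters (assumptions for this example only).
alphabet, length, q = (0, 1, 2), 3, 0.1
genotypes = list(product(alphabet, repeat=length))

# Each row of the kernel is a probability distribution over genotypes.
row_sums = [sum(mutation_kernel(u, v, q, len(alphabet)) for v in genotypes)
            for u in genotypes]
```

The row sums equal 1 because the number of genotypes at distance $d$ from $u$ is $\binom{\ell}{d}(\kappa-1)^d$, so the sum is a binomial expansion of $((1-q)+q)^\ell$.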


Eigen drew two main conclusions from the study of this simplified model:
there is an error threshold phenomenon for the mutation probability and a 
so-called quasispecies regime for subcritical mutation probabilities. Indeed, 
when the length of the sequences goes to $\infty$, an error threshold phenomenon
arises: there exists a critical mutation probability, separating two totally 
different regimes. For supercritical mutation probabilities the population at 
equilibrium is totally random, whereas for subcritical mutation probabilities 
the population at equilibrium is distributed as a quasispecies, i.e., there is a 
positive fraction of the master sequence present in the population along with 
a cloud of mutants that closely resemble the master sequence. 


After Eigen’s proposal of the quasispecies model, many other authors have 
investigated it, both in the simple setting we have just presented and in more 
general settings. Eigen, McCaskill and Schuster [S] studied the model in great 
detail. As pointed out by them, one of the main challenges related to Eigen’s 
model is to find the distribution of the quasispecies: the concentration of
the master sequence and the concentrations of the different mutants in the 
population at equilibrium. It is generally impossible to give explicit formulas 
for these concentrations. Jones, Enns and Rangnekar m and Thompson 
and McBride [25] give an exact solution of the quasispecies by linearising 
Eigen’s system of differential equations. In the same spirit, Swetina and 
Schuster m use this linearisation to characterise the stationary distribution 
of the quasispecies as the eigenvector corresponding to the highest eigenvalue
of the linearised system matrix. Saakian and Hu [IB] derive exact solutions 
for the quasispecies model by assuming a certain ansatz; Saakian [16] and 
Saakian, Biebricher and Hu D2I derive the distribution for several different
fitness landscapes, in particular for smooth landscapes. Novozhilov and
Semenov [BHIBB] and Bratus, Novozhilov and Semenov obtain more
concrete results for the quasispecies distribution for several special cases of
the mutation kernel and the fitness function.


The aim of this article is to present a scheme for obtaining explicit
formulas. The scheme has two key ingredients: we break the
space of genotypes into Hamming classes and we study the asymptotic regime
where the length of the chains $\ell$ goes to $\infty$, the mutation probability $q$ goes to
0 and $\ell q$ goes to $a \in ]0,\infty[$. The idea comes from the articles [HE], where
the authors consider a Moran model in order to recover the error threshold
phenomenon as well as the quasispecies for a finite-population stochastic
model. The Moran model is studied in the setting we have just introduced:
genotypes given by $\mathcal{A}^\ell$, sharp peak landscape and independent mutations
per locus. Eigen's model is recovered in the infinite population limit [6],
the error threshold phenomenon is also recovered, and an explicit formula is
obtained for the distribution of the quasispecies. We now illustrate how the
two ingredients mentioned above make it possible to obtain such a formula, by
applying our scheme directly to Eigen's model.


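Hamming classes. The genotype space $\mathcal{A}^\ell$ is broken into Hamming classes
with respect to the master sequence. To this end we define the mapping
$H : \mathcal{A}^\ell \to \{0,\dots,\ell\}$ by setting

$$\forall u \in \mathcal{A}^\ell \qquad H(u) = d(u, w^*).$$

The mapping $H$ induces a fitness function $A_H : \{0,\dots,\ell\} \to [0,\infty[$ on the
Hamming classes, which is given by:

$$\forall l \in \{0,\dots,\ell\} \qquad A_H(l) = \begin{cases} \sigma & \text{if } l = 0, \\ 1 & \text{if } 1 \le l \le \ell. \end{cases}$$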


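Likewise, the mapping $H$ induces a mutation kernel $M_H$ over the Hamming
classes: for all $b, c \in \{0,\dots,\ell\}$,

$$M_H(b,c) = \sum_{\substack{0 \le k \le \ell - b \\ 0 \le l \le b \\ k - l = c - b}} \binom{\ell - b}{k} q^k (1-q)^{\ell - b - k} \binom{b}{l} \Big( \frac{q}{\kappa - 1} \Big)^{l} \Big( 1 - \frac{q}{\kappa - 1} \Big)^{b-l}.$$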
This formula was first given in [21] and later in a slightly different
form in [15]. For $k \in \{0,\dots,\ell\}$, let us denote by $x_k$ the concentration
of individuals in the Hamming class $k$. According to Eigen's model, the
evolution of the concentrations is driven by the following system of differential
equations:

$$x_k'(t) = \sum_{j=0}^{\ell} x_j(t) A_H(j) M_H(j,k) - x_k(t) \sum_{j=0}^{\ell} x_j(t) A_H(j), \qquad 0 \le k \le \ell.$$

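Asymptotic regime. We make the length of the chains go to $\infty$ and the
mutation probability go to 0 in the following way:

$$\ell \to \infty, \qquad q \to 0, \qquad \ell q \to a \in ]0,\infty[.$$

In this asymptotic regime we obtain a limiting mutation kernel $M_\infty$ given
by: for all $j, k \ge 0$,

$$M_\infty(j,k) = \begin{cases} e^{-a} \dfrac{a^{k-j}}{(k-j)!} & \text{if } j \le k, \\ 0 & \text{if } j > k. \end{cases}$$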
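The convergence of the class kernel towards the Poisson-type kernel $M_\infty$ can be observed numerically. The sketch below assumes the usual form of the kernel on Hamming classes ($k$ correct sites are damaged, $l$ wrong sites are repaired, with $k - l = c - b$); the function names and parameter values are ours and purely illustrative.

```python
from math import comb, exp, factorial

def class_kernel(b, c, length, q, kappa):
    """M_H(b, c): probability of moving from Hamming class b to class c."""
    p = q / (kappa - 1)          # a mutating wrong site hits the master letter
    total = 0.0
    for l in range(b + 1):       # l wrong sites are repaired
        k = c - b + l            # k correct sites are damaged
        if 0 <= k <= length - b:
            total += (comb(length - b, k) * q**k * (1 - q)**(length - b - k)
                      * comb(b, l) * p**l * (1 - p)**(b - l))
    return total

def limit_kernel(j, k, a):
    """M_infinity(j, k) = e^{-a} a^{k-j} / (k-j)! for j <= k, and 0 otherwise."""
    return exp(-a) * a**(k - j) / factorial(k - j) if k >= j else 0.0

# Illustrative parameters: long chains with length * q = a.
a, kappa, length = 1.5, 4, 100_000
gap = max(abs(class_kernel(b, c, length, a / length, kappa) - limit_kernel(b, c, a))
          for b in range(4) for c in range(7))
```

For chains this long the largest discrepancy over the first few classes is small, which is consistent with the limit stated above: back mutations towards the master sequence become negligible and the number of new mutations becomes Poisson with mean $a$.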


We can now write the limiting system of differential equations:

$$x_k'(t) = \sum_{j=0}^{k} x_j(t) A_H(j) e^{-a} \frac{a^{k-j}}{(k-j)!} - x_k(t) \sum_{j=0}^{\infty} x_j(t) A_H(j), \qquad k \ge 0.$$

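The distribution of the quasispecies is the only positive stationary solution
of the above system, which exists for values of $a$ such that $\sigma e^{-a} > 1$, and is
given by the formula

$$\rho_k = (\sigma e^{-a} - 1) \frac{a^k}{k!} \sum_{i \ge 1} \frac{i^k}{\sigma^i}, \qquad k \ge 0.$$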
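The sharp peak distribution can be checked numerically. The sketch below assumes the closed form $\rho_k = (\sigma e^{-a} - 1)\frac{a^k}{k!}\sum_{i\ge1} i^k/\sigma^i$, truncates the inner series, and verifies that the $\rho_k$ sum to 1 and annihilate the right-hand side of the limiting system. The truncation levels and the values of $\sigma$ and $a$ are arbitrary choices for the example.

```python
from math import exp, factorial

def sharp_peak_quasispecies(sigma, a, n_classes=60, n_terms=200):
    """rho_k for k < n_classes, with the series over i truncated at n_terms."""
    norm = sigma * exp(-a) - 1
    return [norm * a**k / factorial(k)
            * sum(i**k / sigma**i for i in range(1, n_terms + 1))
            for k in range(n_classes)]

sigma, a = 5.0, 1.0                      # sigma * e^{-a} > 1: below the threshold
rho = sharp_peak_quasispecies(sigma, a)
fitness = lambda j: sigma if j == 0 else 1.0
phi = sigma * exp(-a)                    # mean fitness at equilibrium

# Right-hand side of the limiting system evaluated at rho (should vanish).
residuals = [sum(rho[j] * fitness(j) * exp(-a) * a**(k - j) / factorial(k - j)
                 for j in range(k + 1)) - rho[k] * phi
             for k in range(20)]
```

In particular $\rho_0 = (\sigma e^{-a} - 1)/(\sigma - 1)$, which is positive exactly when $\sigma e^{-a} > 1$.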


Our objective is to generalise this formula to fitness functions $f : \mathbb{N} \to [0,\infty[$
other than the sharp peak landscape fitness function.


2 Results 


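Let $f : \mathbb{N} \to [0,\infty[$. We consider the system of differential equations

$$x_k'(t) = \sum_{j=0}^{k} x_j(t) f(j) e^{-a} \frac{a^{k-j}}{(k-j)!} - x_k(t) \sum_{j=0}^{\infty} x_j(t) f(j), \qquad k \ge 0,$$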
and we look for the stationary solutions of the system, i.e., we want to solve
the system of equations

$$(\mathcal{S}) \qquad 0 = \sum_{j=0}^{k} x_j f(j) e^{-a} \frac{a^{k-j}}{(k-j)!} - x_k \Phi, \qquad k \ge 0,$$

where $\Phi = \sum_{j \ge 0} x_j f(j)$. Since we think of $x_k$ as the concentration of the
Hamming class $k$ in a population, we are only interested in non-negative
solutions of the system $(\mathcal{S})$. We say that $(x_k)_{k \ge 0}$ is a quasispecies associated
to $f$ if it is a non-negative solution of $(\mathcal{S})$ such that $x_0 > 0$ and $\sum_{k \ge 0} x_k = 1$.


Assumption. We suppose that the fitness of the Hamming class 0 is
higher than the fitness of all the other classes, i.e., the fitness function
$f : \mathbb{N} \to [0,\infty[$ satisfies $f(0) > f(k)$, $k \ge 1$.


Note that the hypothesis is consistent with the Hamming class 0 corresponding
to the master sequence (the fittest genotype). From now on, every fitness
function is assumed to satisfy this hypothesis. We fix one such fitness function
$f$ and we focus on finding the quasispecies distributions associated
to $f$.


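Let us remark that under this assumption, if $(x_k)_{k \ge 0}$ is a quasispecies, then
the concentration $x_k$ of every Hamming class $k$ is strictly positive. Indeed,
since we assume that $x_0 > 0$, and since $\sum_{j \ge 0} x_j f(j) \le f(0)$,

$$x_k = \frac{\displaystyle\sum_{0 \le j \le k} x_j f(j) e^{-a} \frac{a^{k-j}}{(k-j)!}}{\displaystyle\sum_{j \ge 0} x_j f(j)} \ge \frac{x_0 f(0) e^{-a} \dfrac{a^k}{k!}}{f(0)} > 0.$$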


The first of our results expresses the fitnesses as a function of the
concentrations of the quasispecies.


Theorem 2.1. Let us suppose that $(x_k)_{k \ge 0}$ is a quasispecies associated to $f$.
Then,

$$\forall k \ge 1 \qquad f(k) = \frac{f(0)}{x_k} \sum_{j=0}^{k} \frac{(-a)^j}{j!} x_{k-j}.$$


The interest of this result lies in its potential applications. When performing
practical experiments, the concentrations of the different genotypes can be
measured, and one delicate question is to infer the underlying fitness
landscape. Recent progress even allows the sequencing of in-vivo virus populations,
and the quasispecies model is one of the main tools employed in order to
infer the fitness landscape from the experimental data [19].
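To make the inference concrete, here is a minimal sketch that recovers a fitness landscape from class concentrations using the relation $f(k) = \frac{f(0)}{x_k}\sum_{j=0}^{k}\frac{(-a)^j}{j!}x_{k-j}$. The synthetic concentrations are generated from a made-up landscape `f_true` by solving the stationary equations forward; all numerical values are assumptions for the example, and $f(0)$ and $a$ are taken as known.

```python
from math import factorial

def solve_recurrence(f, a, n):
    """Relative concentrations y_k = x_k / x_0 for k < n (forward substitution)."""
    y = [1.0]
    for k in range(1, n):
        s = sum(y[j] * f[j] * a**(k - j) / factorial(k - j) for j in range(k))
        y.append(s / (f[0] - f[k]))
    return y

def infer_fitness(x, a, f0):
    """f(k) = f(0) / x_k * sum_{j=0}^{k} (-a)^j / j! * x_{k-j}, for k >= 1."""
    return [f0] + [f0 / x[k] * sum((-a)**j / factorial(j) * x[k - j]
                                   for j in range(k + 1))
                   for k in range(1, len(x))]

f_true = [4.0, 2.0, 1.5, 3.0, 1.0, 0.5]   # hypothetical landscape, f(0) is the largest
a = 0.7
y = solve_recurrence(f_true, a, len(f_true))
x = [v / sum(y) for v in y]               # the formula only uses ratios of the x_k
f_est = infer_fitness(x, a, f_true[0])
```

Since the formula is homogeneous of degree 0 in the concentrations, any common rescaling of the measured $x_k$ leaves the inferred fitnesses unchanged; only $f(k)/f(0)$ is identifiable from the concentrations and $a$.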


We now look for an inverse formula; in other words, we want to express the
concentrations of the different Hamming classes as a function of the fitnesses.
Let $(x_k)_{k \ge 0}$ be a quasispecies associated to $f$. The equation for $k = 0$ in the
system $(\mathcal{S})$ is

$$0 = x_0 \big( f(0) e^{-a} - \Phi \big).$$

Since we suppose that $x_0 > 0$, we have $\Phi = f(0) e^{-a}$. Replacing $\Phi$ by $f(0) e^{-a}$
in $(\mathcal{S})$ we obtain a recurrence relation for $(x_k)_{k \ge 0}$. To begin with, we will try
to solve the recurrence relation with initial condition equal to 1, i.e.,

$$(\mathcal{R}) \qquad y_0 = 1, \qquad y_k = \frac{1}{f(0) - f(k)} \sum_{j=0}^{k-1} y_j f(j) \frac{a^{k-j}}{(k-j)!}, \quad k \ge 1.$$

Lemma 2.2. Let $(y_k)_{k \ge 0}$ be the solution of the recurrence relation $(\mathcal{R})$.

• If the series associated to $(y_k)_{k \ge 0}$ converges, there exists a unique
quasispecies $(x_k)_{k \ge 0}$ associated to $f$, which is given by:

$$x_0 = \Big( \sum_{k \ge 0} y_k \Big)^{-1}, \qquad x_k = x_0 y_k, \quad k \ge 1.$$

• If the series associated to $(y_k)_{k \ge 0}$ diverges, no quasispecies associated to $f$
exists.


Proof. The first statement of the lemma is obviously true. For the second
one, note that if $(x_k)_{k \ge 0}$ is a quasispecies associated to $f$, then the sequence
$(y_k)_{k \ge 0}$ defined by $y_k = x_k / x_0$, $k \ge 0$, satisfies the recurrence relation $(\mathcal{R})$,
and the series associated to $(y_k)_{k \ge 0}$ converges. □
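The construction of the lemma translates directly into a short computation: solve the recurrence forward, normalise, and check that the result is stationary. The sketch below does this for a hypothetical landscape that is constant from class 3 onwards; the truncation at 80 classes and the parameter values are assumptions of the example.

```python
from math import exp, factorial

def solve_recurrence(f, a, n):
    """y_0 = 1 and y_k = sum_{j<k} y_j f(j) a^(k-j)/(k-j)! / (f(0) - f(k))."""
    y = [1.0]
    for k in range(1, n):
        s = sum(y[j] * f(j) * a**(k - j) / factorial(k - j) for j in range(k))
        y.append(s / (f(0) - f(k)))
    return y

def quasispecies(f, a, n=80):
    """x_0 = 1 / sum(y), x_k = x_0 * y_k, with the series truncated at n terms."""
    y = solve_recurrence(f, a, n)
    total = sum(y)
    return [v / total for v in y]

def fitness(k):
    # Hypothetical landscape: f(0) = 5 is the strict maximum.
    return [5.0, 3.0, 2.0][k] if k < 3 else 1.0

a = 0.5
x = quasispecies(fitness, a)
phi = sum(x[j] * fitness(j) for j in range(len(x)))   # mean fitness

# Stationary equations evaluated at x (should vanish up to rounding).
residuals = [sum(x[j] * fitness(j) * exp(-a) * a**(k - j) / factorial(k - j)
                 for j in range(k + 1)) - x[k] * phi
             for k in range(20)]
```

The truncation is harmless here because, for this choice of parameters, the $y_k$ decay geometrically; when the series diverges no normalisation is possible, in agreement with the second statement of the lemma.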

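Next we give three different explicit formulas for the sequence $(y_k)_{k \ge 0}$. The
first of the formulas involves multinomial coefficients.

Theorem 2.3. For all $k \ge 1$,

$$y_k = \frac{a^k}{k!} \frac{f(0)}{f(0) - f(k)} \Bigg( 1 + \sum_{\substack{1 \le h < k \\ 1 \le i_1 < \cdots < i_h < k}} \frac{k!}{i_1! (i_2 - i_1)! \cdots (k - i_h)!} \prod_{t=1}^{h} \frac{f(i_t)}{f(0) - f(i_t)} \Bigg).$$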
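The multinomial formula can be compared with the recurrence on a small example. The sketch below uses exact rational arithmetic so that the comparison is an equality rather than a floating point approximation; the landscape `f` and the value of `a` are arbitrary test values, not data from the article.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def y_recurrence(f, a, n):
    """Solve the recurrence for y_0, ..., y_{n-1} exactly."""
    y = [Fraction(1)]
    for k in range(1, n):
        s = sum(y[j] * f[j] * a**(k - j) / factorial(k - j) for j in range(k))
        y.append(s / (f[0] - f[k]))
    return y

def y_multinomial(f, a, k):
    """y_k from the sum over chains 1 <= i_1 < ... < i_h < k."""
    total = Fraction(1)
    for h in range(1, k):
        for idx in combinations(range(1, k), h):
            cuts = (0,) + idx + (k,)
            coeff = Fraction(factorial(k))
            for lo, hi in zip(cuts, cuts[1:]):
                coeff /= factorial(hi - lo)      # multinomial coefficient
            for i in idx:
                coeff *= f[i] / (f[0] - f[i])
            total += coeff
    return a**k / factorial(k) * f[0] / (f[0] - f[k]) * total

f = [Fraction(v) for v in (5, 3, 4, 2, 1, 3, 2)]   # f(0) is the strict maximum
a = Fraction(2, 3)
y = y_recurrence(f, a, len(f))
```

The number of chains grows like $2^{k-1}$, so the explicit formula is mainly of theoretical interest; for numerical work the recurrence is much cheaper.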
Up-down coefficients. The sequence $(y_k)_{k \ge 0}$ can also be expressed in
terms of up-down coefficients. Let us first introduce the up-down numbers
or coefficients [23]. Let $n \ge 2$ and let $(q_1,\dots,q_{n-1}) \in \{-1,1\}^{n-1}$. We say
that a permutation $\sigma = (\sigma(1),\dots,\sigma(n))$ of $\{1,\dots,n\}$ has Niven's signature
$(q_1,\dots,q_{n-1})$ if for every $i \in \{1,\dots,n-1\}$, the product $q_i(\sigma(i+1) - \sigma(i))$
is positive [14].


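Definition 2.4. Let $n \ge 2$, $0 \le h < n$ and $0 = i_0 < i_1 < \cdots < i_h < n$. The
up-down coefficient

$$\left\langle {n \atop i_0, \dots, i_h} \right\rangle$$

is defined as the number of permutations of $\{1,\dots,n\}$ having Niven's
signature $(q_1,\dots,q_{n-1})$ given by

$$\forall i \in \{1,\dots,n-1\} \qquad q_i = \begin{cases} -1 & \text{if } i \in \{i_1,\dots,i_h\}, \\ +1 & \text{otherwise.} \end{cases}$$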


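Theorem 2.5. For all $k \ge 1$,

$$y_k = \frac{a^k}{k!} \prod_{j=1}^{k} \frac{f(0)}{f(0) - f(j)} \sum_{\substack{0 \le h < k \\ 0 = i_0 < \cdots < i_h < k}} \left\langle {k \atop i_0, \dots, i_h} \right\rangle \prod_{t=0}^{h} \frac{f(i_t)}{f(0)}.$$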
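The up-down formula can be tested by brute force for small $k$. In the sketch below, `up_down(n, idx)` counts the permutations of $\{1,\dots,n\}$ whose descents occur exactly at the positions in `idx`; this sign convention is an assumption on our part, and counting ascents instead gives the same numbers because the map $\sigma \mapsto n + 1 - \sigma$ exchanges the two. The test landscape is arbitrary.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial

def up_down(n, idx):
    """Permutations of {1..n} with sigma(i+1) < sigma(i) exactly for i in idx."""
    target = set(idx)
    return sum(1 for p in permutations(range(1, n + 1))
               if {i for i in range(1, n) if p[i] < p[i - 1]} == target)

def y_recurrence(f, a, n):
    y = [Fraction(1)]
    for k in range(1, n):
        s = sum(y[j] * f[j] * a**(k - j) / factorial(k - j) for j in range(k))
        y.append(s / (f[0] - f[k]))
    return y

def y_up_down(f, a, k):
    """y_k as a weighted sum of up-down coefficients."""
    prefactor = a**k / factorial(k)
    for j in range(1, k + 1):
        prefactor *= f[0] / (f[0] - f[j])
    total = Fraction(0)
    for h in range(k):
        for idx in combinations(range(1, k), h):
            term = Fraction(up_down(k, idx))
            for i in idx:
                term *= f[i] / f[0]
            total += term
    return prefactor * total

f = [Fraction(v) for v in (6, 2, 5, 1, 3, 4)]   # f(0) is the strict maximum
a = Fraction(3, 2)
y = y_recurrence(f, a, len(f))
```

All the terms of the sum are non-negative, which is the main structural difference with the multinomial formula once the latter is put over a common denominator.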


Our last result concerns fitness functions that are eventually constant. For
such functions we can express the concentrations $(y_k)_{k \ge 0}$ in terms of the
concentrations $(q_k)_{k \ge 0}$ of the quasispecies associated to the sharp peak landscape
fitness function. Let $f$ be a fitness function which is eventually constant equal
to $c > 0$. Define $(q_k)_{k \ge 0}$ as the solution to the recurrence relation $(\mathcal{R})$ with
fitness function $(f(0), c, c, \dots)$, i.e.,

$$q_k = \Big( \frac{f(0)}{c} - 1 \Big) \frac{a^k}{k!} \sum_{i \ge 1} i^k \Big( \frac{c}{f(0)} \Big)^{i}, \qquad k \ge 0.$$

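Theorem 2.6. Let $N \ge 1$ be such that

$$f(N) \neq c \quad \text{and} \quad \forall k > N \quad f(k) = c.$$

Then, for all $k > N$,

$$y_k = q_k + \sum_{j=1}^{N} \frac{a^j}{j!}\, q_{k-j} \Big( \frac{f(j)}{f(0) - f(j)} - \frac{c}{f(0) - c} \Big) \times \Bigg( 1 + \sum_{h=1}^{j-1} \sum_{0 = i_0 < \cdots < i_h < j} \prod_{t=1}^{h} \binom{j - i_{t-1}}{j - i_t} \frac{f(i_t)}{f(0) - f(i_t)} \Bigg).$$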
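The statement can be checked exactly on a small eventually constant landscape. In the sketch below the sequence $(q_k)$ is obtained by solving the recurrence for the landscape $(f(0), c, c, \dots)$, and the inner bracket is computed by the helper `C`; the names, the landscape and the value of `a` are ours and serve only as a test case.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial

def solve_recurrence(f, a, n):
    y = [Fraction(1)]
    for k in range(1, n):
        s = sum(y[j] * f[j] * a**(k - j) / factorial(k - j) for j in range(k))
        y.append(s / (f[0] - f[k]))
    return y

def C(l, f):
    """1 + sum over chains 0 = i_0 < i_1 < ... < i_h < l of products of binomials."""
    total = Fraction(1)
    for h in range(1, l):
        for idx in combinations(range(1, l), h):
            term, prev = Fraction(1), 0
            for i in idx:
                term *= comb(l - prev, l - i) * f[i] / (f[0] - f[i])
                prev = i
            total += term
    return total

def y_eventually_constant(f, a, N, c, k):
    """y_k for k > N, expressed through the sharp peak sequence q."""
    q = solve_recurrence([f[0]] + [c] * k, a, k + 1)
    return q[k] + sum(a**j / factorial(j) * q[k - j]
                      * (f[j] / (f[0] - f[j]) - c / (f[0] - c)) * C(j, f)
                      for j in range(1, N + 1))

c, N = Fraction(1), 3
f = [Fraction(v) for v in (5, 3, 4, 2)] + [c] * 6   # constant equal to c after class N
a = Fraction(4, 5)
y = solve_recurrence(f, a, len(f))
```

Only the first $N$ classes enter the correction term, so the tail behaviour of $(y_k)$ is inherited from that of $(q_k)$.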
Finally we give a condition guaranteeing the existence of a quasispecies
associated to $f$. Let us recall that $(x_k)_{k \ge 0}$ is a quasispecies associated to $f$ if
it is a non-negative solution of $(\mathcal{S})$, $x_0 > 0$ and the sum of the $x_k$'s is 1.

Corollary 2.7. We have:

• If $f(0) e^{-a} > \limsup_{n \to \infty} f(n)$, the series associated to $(y_k)_{k \ge 0}$ converges
and there exists a unique quasispecies associated to $f$.

• If $f(0) e^{-a} < \liminf_{n \to \infty} f(n)$, the series associated to $(y_k)_{k \ge 0}$ diverges and
no quasispecies associated to $f$ exists.


We remark that for fitness functions $f$ (satisfying our assumption), the above
corollary corresponds to the error threshold phenomenon observed by Eigen.
Moreover, the error threshold depends only on $a$, $f(0)$, and the limiting
behaviour of the fitness function $f$.
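The threshold can be observed on the sharp peak landscape $(f(0), c, c, \dots)$, for which the critical value of $a$ solves $f(0)e^{-a} = c$. The sketch below is only an illustration: it looks at the ratio of two consecutive terms $y_k$ far along the sequence, which is below 1 when the series converges geometrically and above 1 when it diverges. The parameter values are arbitrary.

```python
from math import factorial, log

def solve_recurrence(f, a, n):
    y = [1.0]
    for k in range(1, n):
        s = sum(y[j] * f(j) * a**(k - j) / factorial(k - j) for j in range(k))
        y.append(s / (f(0) - f(k)))
    return y

def growth_ratio(f0, c, a, n=60):
    """Ratio y_{n-1} / y_{n-2} for the landscape (f0, c, c, ...)."""
    y = solve_recurrence(lambda k: f0 if k == 0 else c, a, n)
    return y[-1] / y[-2]

f0, c = 3.0, 1.0
critical_a = log(f0 / c)              # solves f0 * exp(-a) = c
below = growth_ratio(f0, c, 0.5)      # a < critical_a: terms shrink
above = growth_ratio(f0, c, 2.0)      # a > critical_a: terms grow
```

A ratio test of this kind is a heuristic diagnostic, not a proof; the rigorous statement is the corollary itself.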

To finish this section, we discuss the motivations for making the assumption
$f(0) > f(k)$, $k \ge 1$, and why we are mainly interested in solutions $(x_k)_{k \ge 0}$ of
$(\mathcal{S})$ satisfying $x_0 > 0$. Let $K \ge 0$. We call $(x_k)_{k \ge 0}$ a quasispecies distribution
around $K$ associated to $f$ if it is a non-negative solution of $(\mathcal{S})$ such that

$$x_0 = \cdots = x_{K-1} = 0 < x_K \quad \text{and} \quad \sum_{k \ge K} x_k = 1.$$

Lemma 2.8. Let $K \ge 0$, and define the mapping $g : \mathbb{N} \to [0,\infty[$ by

$$\forall k \ge 0 \qquad g(k) = f(K + k).$$

The sequence $(x_k)_{k \ge 0}$ is a quasispecies distribution around $K$ associated to $f$
if and only if the sequence $(x_{K+i})_{i \ge 0}$ is a quasispecies distribution around 0
associated to $g$.

Proof. Let the sequence $(x_k)_{k \ge 0}$ be a quasispecies distribution around $K$
associated to $f$. Since $x_0 = \cdots = x_{K-1} = 0$, for all $k \ge K$ we have

$$0 = \sum_{j=K}^{k} x_j f(j) e^{-a} \frac{a^{k-j}}{(k-j)!} - x_k \sum_{j \ge K} x_j f(j).$$

We set $i = k - K$ and $h = j - K$ in the above formula and we see that for
all $i \ge 0$,

$$0 = \sum_{h=0}^{i} x_{K+h} f(K+h) e^{-a} \frac{a^{i-h}}{(i-h)!} - x_{K+i} \sum_{h \ge 0} x_{K+h} f(K+h) = \sum_{h=0}^{i} x_{K+h}\, g(h) e^{-a} \frac{a^{i-h}}{(i-h)!} - x_{K+i} \sum_{h \ge 0} x_{K+h}\, g(h).$$

Therefore, the sequence $(x_{K+i})_{i \ge 0}$ is a quasispecies distribution around 0
associated to $g$. The converse implication is proved similarly. □

Lemma 2.9. Suppose there exists $K \ge 1$ such that $f(K) > \max_{0 \le k < K} f(k)$.
Then, for $k \in \{0,\dots,K-1\}$, no quasispecies distribution around $k$
associated to $f$ exists.


Proof. Let us suppose that the sequence $(x_k)_{k \ge 0}$ is a solution of $(\mathcal{S})$. Let
$k \in \{0,\dots,K-1\}$ and let us suppose further that $x_0 = \cdots = x_{k-1} = 0$ and
$x_k \neq 0$. We will show that if $x_k \ge 0, \dots, x_{K-1} \ge 0$, then necessarily $x_K < 0$.
On one hand, writing down the $K$-th equation of $(\mathcal{S})$ we see that

$$x_K \big( \Phi - f(K) e^{-a} \big) = \sum_{j=k}^{K-1} x_j f(j) e^{-a} \frac{a^{K-j}}{(K-j)!}.$$

On the other hand, writing down the $k$-th equation of $(\mathcal{S})$, since $x_0 = \cdots =
x_{k-1} = 0$ and $x_k > 0$, we conclude that $\Phi = f(k) e^{-a}$. Since $f(k) < f(K)$, if
$x_k \ge 0, \dots, x_{K-1} \ge 0$, necessarily $x_K < 0$. This implies that no quasispecies
distribution around $k$ associated to $f$ exists. □


The above lemmas justify the hypothesis on the fitness function $f$, as well as
the search for quasispecies distributions around 0 associated to $f$. From now
on, if $(x_k)_{k \ge 0}$ is a quasispecies distribution around 0 associated to $f$, and
when there is no confusion, we will simply say that $(x_k)_{k \ge 0}$ is a quasispecies.


3 Related results 


We have given three different explicit formulas for the stationary solutions
of the system:

$$x_k'(t) = \sum_{j=0}^{k} x_j(t) f(j) e^{-a} \frac{a^{k-j}}{(k-j)!} - x_k(t) \sum_{j=0}^{\infty} x_j(t) f(j), \qquad k \ge 0.$$
As we have pointed out in the introduction, this infinite system of differential
equations arises from Eigen's system of differential equations:

$$x_k'(t) = \sum_{j=0}^{\ell} x_j(t) f(j) M_H(j,k) - x_k(t) \sum_{j=0}^{\ell} x_j(t) f(j), \qquad 0 \le k \le \ell,$$

when considering the asymptotic regime

$$\ell \to \infty, \qquad q \to 0, \qquad \ell q \to a \in [0,\infty[.$$

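Eigen's system of differential equations can be defined in greater
generality: given an at most countable set of types $\mathcal{G}$, a non-negative fitness
function $f$ on $\mathcal{G}$, and a stochastic matrix $M = (M(u,v),\ u,v \in \mathcal{G})$, Eigen's
model becomes

$$(*) \qquad x_v'(t) = \sum_{u \in \mathcal{G}} x_u(t) f(u) M(u,v) - x_v(t) \sum_{u \in \mathcal{G}} x_u(t) f(u), \qquad v \in \mathcal{G}.$$

Define the matrix $W$ by setting

$$\forall u, v \in \mathcal{G}, \qquad W(u,v) = f(u) M(u,v).$$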

For a finite state space $\mathcal{G}$ and under the hypothesis that the matrix $W$ is
irreducible, an application of the Perron-Frobenius theorem for positive
matrices shows that the system $(*)$ has a unique stationary solution which is
globally stable [HlHlITniES]. A similar result was proven by Moran [T2] for a
discrete-time version of this model:

$$(**) \qquad x_v(n+1) = \frac{\displaystyle\sum_{u \in \mathcal{G}} x_u(n) f(u) M(u,v)}{\displaystyle\sum_{u \in \mathcal{G}} x_u(n) f(u)}, \qquad v \in \mathcal{G}.$$
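The discrete-time dynamics $(**)$ is easy to iterate on a finite state space. The sketch below takes as types the Hamming classes of binary sequences of length 6 with a sharp peak landscape, builds the class mutation kernel, and iterates the map from the uniform distribution; the state space, the parameter values and the number of iterations are all assumptions made for the example.

```python
from math import comb

def class_kernel(b, c, length, q, kappa):
    """Mutation kernel between Hamming classes b and c."""
    p = q / (kappa - 1)
    total = 0.0
    for l in range(b + 1):
        k = c - b + l
        if 0 <= k <= length - b:
            total += (comb(length - b, k) * q**k * (1 - q)**(length - b - k)
                      * comb(b, l) * p**l * (1 - p)**(b - l))
    return total

def moran_step(x, f, M):
    """One step of (**): selection with weights f, mutation with M, normalisation."""
    n = len(x)
    mean_fitness = sum(x[u] * f[u] for u in range(n))
    return [sum(x[u] * f[u] * M[u][v] for u in range(n)) / mean_fitness
            for v in range(n)]

length, q, kappa, sigma = 6, 0.05, 2, 4.0
f = [sigma] + [1.0] * length
M = [[class_kernel(b, c, length, q, kappa) for c in range(length + 1)]
     for b in range(length + 1)]

x = [1.0 / (length + 1)] * (length + 1)   # start from the uniform distribution
for _ in range(500):
    x = moran_step(x, f, M)
drift = max(abs(u - v) for u, v in zip(moran_step(x, f, M), x))
```

After enough iterations the distribution no longer moves, and the limit is the normalised left Perron eigenvector of $W$, which is also the stationary solution of the continuous-time system.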

Once again, an application of the Perron-Frobenius theorem shows that the
dynamical system $(**)$ has a unique fixed point, which is globally stable.
Of course, the stationary solution of the continuous dynamical system and
the fixed point of the discrete dynamical system are the same. Moran also
extended this result [T^[T5] to the case where $\mathcal{G} = \mathbb{Z}$ and mutations only
happen between nearest neighbours, i.e., for $q \in ]0, 1/2[$ and $i, j \in \mathbb{Z}$, the mutation
matrix $M$ is defined by:

$$M(i,j) = \begin{cases} q & \text{if } j = i \pm 1, \\ 1 - 2q & \text{if } j = i, \\ 0 & \text{otherwise.} \end{cases}$$




Kingman m further generalises Moran's result. Let $\mathcal{G} = \mathbb{N}$ and make the
following assumptions:

• The fitness function is positive and bounded, i.e., there exists a constant
$C > 0$ such that

$$\forall k \ge 0, \qquad 0 < f(k) \le C.$$

• The mutation matrix $M$ is irreducible and aperiodic.

Let $\lambda$ be the spectral radius of the matrix $W$. Kingman then shows that if

$$\limsup_{k \to \infty} f(k) < \lambda,$$

then there exists a unique positive fixed point of $(**)$ having 1 as the sum
of its components. Moreover, this fixed point is globally stable. Kingman's
result generalises the first statement of our corollary 2.7. Indeed, in our
setting $\sigma e^{-a}$ (with $\sigma = f(0)$) corresponds to the spectral radius $\lambda$. Our result, however, does
not follow directly from Kingman's result, for he assumes the matrix $W$ to be
recurrent, which is not satisfied in our case. Kingman's proof, which is based
on an infinite dimensional version of the Perron-Frobenius theorem, could be
extended to show the existence of a quasispecies, but not the uniqueness. We
have therefore chosen to exploit the explicit formulas obtained above to derive an
analogue of Kingman's result directly. This procedure has not only allowed
us to retrieve Kingman's result in our particular setting, but also to give a
similar condition under which a quasispecies cannot be formed.


4 Proof of theorems 2.1, 2.3 and 2.5


Proof of theorem 2.1. Let us suppose that $(x_k)_{k \ge 0}$ is a quasispecies. Let us
show that, for all $k \ge 1$,

$$f(k) = \frac{f(0)}{x_k} \sum_{j=0}^{k} \frac{(-a)^j}{j!} x_{k-j}.$$

We will make the proof by induction. The sequence $(x_k)_{k \ge 0}$ is a quasispecies;
in particular, $x_0 > 0$ and $\Phi = f(0) e^{-a}$. Replacing $\Phi$ by $f(0) e^{-a}$ in $(\mathcal{S})$ and
rearranging the terms gives

$$f(k) = \frac{1}{x_k} \Big( f(0) x_k - \sum_{j=0}^{k-1} x_j f(j) \frac{a^{k-j}}{(k-j)!} \Big), \qquad k \ge 1.$$

In particular, for $k = 1$,

$$f(1) = \frac{f(0)}{x_1} (x_1 - a x_0).$$

So the result holds for $k = 1$. We now fix $k > 1$ and we suppose that the
result holds up to $k - 1$. We replace the values of $f(1),\dots,f(k-1)$ in the
above formula and we obtain

$$f(k) = \frac{f(0)}{x_k} \Big( x_k - \sum_{j=0}^{k-1} \sum_{h=0}^{j} \frac{a^{k-j}}{(k-j)!} \frac{(-a)^h}{h!} x_{j-h} \Big).$$

Let us fix $i \in \{1,\dots,k\}$ and let us look for the coefficient of $x_{k-i}$ in the
parenthesis of the above expression. This coefficient is

$$- \sum_{\substack{0 \le h \le j < k \\ j - h = k - i}} (-1)^h \frac{a^h}{h!} \frac{a^{k-j}}{(k-j)!} = - \sum_{0 \le h < i} (-1)^h \frac{a^h}{h!} \frac{a^{i-h}}{(i-h)!} = (-1)^i \frac{a^i}{i!},$$

which concludes the proof of the theorem. □


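Proof of theorem 2.3. We show that, for all $k \ge 1$,

$$y_k = \frac{a^k}{k!} \frac{f(0)}{f(0) - f(k)} \Bigg( 1 + \sum_{\substack{1 \le h < k \\ 1 \le i_1 < \cdots < i_h < k}} \frac{k!}{i_1! (i_2 - i_1)! \cdots (k - i_h)!} \prod_{t=1}^{h} \frac{f(i_t)}{f(0) - f(i_t)} \Bigg).$$

Rearranging the terms in $(\mathcal{R})$ gives:

$$y_k \big( f(0) - f(k) \big) = \sum_{j=0}^{k-1} y_j f(j) \frac{a^{k-j}}{(k-j)!}, \qquad k \ge 1.$$

We make the following changes of variables:

$$z_0 = y_0, \qquad z_k = y_k \big( f(0) - f(k) \big), \quad k \ge 1, \qquad g(0) = f(0), \qquad g(j) = \frac{f(j)}{f(0) - f(j)}, \quad j \ge 1.$$

With these changes of variables, the recurrence relation becomes

$$z_k = \sum_{j=0}^{k-1} z_j\, g(j) \frac{a^{k-j}}{(k-j)!}, \qquad k \ge 1.$$

We iterate this formula and we obtain, for all $k \ge 1$,

$$z_k = z_0\, g(0) \frac{a^k}{k!} \Bigg( 1 + \sum_{h=1}^{k-1} \sum_{1 \le i_1 < \cdots < i_h < k} \frac{k!}{i_1! (i_2 - i_1)! \cdots (k - i_h)!} \prod_{t=1}^{h} g(i_t) \Bigg).$$

We replace $z_0$, $z_k$ and $g(0),\dots,g(k-1)$ by their respective values and we
obtain the desired result. □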


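Proof of theorem 2.5. We show that, for all $k \ge 1$,

$$y_k = \frac{a^k}{k!} \prod_{j=1}^{k} \frac{f(0)}{f(0) - f(j)} \sum_{\substack{0 \le h < k \\ 0 = i_0 < \cdots < i_h < k}} \left\langle {k \atop i_0, \dots, i_h} \right\rangle \prod_{t=0}^{h} \frac{f(i_t)}{f(0)}.$$

We take the formula from theorem 2.3 and we set $\prod_{j=1}^{k-1} (f(0) - f(j))$ as a
common denominator; we get

$$y_k = \frac{a^k}{k!} \frac{f(0)}{\prod_{1 \le j \le k} (f(0) - f(j))} \Bigg( \prod_{1 \le j < k} (f(0) - f(j)) + \sum_{h=1}^{k-1} \sum_{1 \le i_1 < \cdots < i_h < k} \frac{k!}{i_1! (i_2 - i_1)! \cdots (k - i_h)!} \prod_{t=1}^{h} f(i_t) \prod_{\substack{1 \le j < k \\ j \notin \{i_1,\dots,i_h\}}} (f(0) - f(j)) \Bigg).$$

The expression in the large parenthesis is a homogeneous polynomial of
degree $k - 1$ in the variables $f(0),\dots,f(k-1)$. For each $h \in \{1,\dots,k-1\}$ and
$1 \le i_1 < \cdots < i_h < k$, we get a monomial of the form $f(0)^{k-1-h} f(i_1) \cdots f(i_h)$.
We calculate the coefficient of each of these monomials and we conclude that

$$y_k = \frac{a^k}{k!} \frac{f(0)}{\prod_{1 \le j \le k} (f(0) - f(j))} \Bigg( f(0)^{k-1} + \sum_{h=1}^{k-1} \sum_{1 \le i_1 < \cdots < i_h < k} f(0)^{k-1-h} \prod_{t=1}^{h} f(i_t) \times \bigg( (-1)^h + \sum_{t=1}^{h} \sum_{1 \le j_1 < \cdots < j_t \le h} (-1)^{h-t} \frac{k!}{i_{j_1}! (i_{j_2} - i_{j_1})! \cdots (k - i_{j_t})!} \bigg) \Bigg).$$

We know from [3] that

$$\left\langle {k \atop 0, i_1, \dots, i_h} \right\rangle = (-1)^h + \sum_{t=1}^{h} \sum_{1 \le j_1 < \cdots < j_t \le h} (-1)^{h-t} \frac{k!}{i_{j_1}! (i_{j_2} - i_{j_1})! \cdots (k - i_{j_t})!},$$

which implies the desired result. □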
5 Proof of theorem 2.6


Let us introduce some notation before jumping into the proof of the theorem.
For a fitness landscape $f$ and $k \ge 0$, we define the fitness landscape $f^{(k)}$
obtained by shifting $k$ places to the left the fitnesses of the different classes
and keeping the fitness of the class 0, that is,

$$\forall j \ge 0 \qquad f^{(k)}(j) = \begin{cases} f(0) & \text{if } j = 0, \\ f(j + k) & \text{if } j \ge 1. \end{cases}$$

[Figure: a fitness landscape $f$ and shifted landscapes such as $f^{(1)}$ and $f^{(3)}$.]


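For a fitness landscape $f$, we denote by $(y_k(f))_{k \ge 0}$ the solution to the
recurrence $(\mathcal{R})$ corresponding to the fitness landscape $f$. We start by establishing
the following lemma, which expresses the value of $y_k(f)$ as a function of
the values $y_{k-j}(f^{(j)})$, $1 \le j \le k-1$.

Lemma 5.1. For all $k \ge 2$, we have

$$y_k(f) = \frac{a^k}{k!} \frac{f(0)}{f(0) - f(k)} + \sum_{j=1}^{k-1} \frac{a^j}{j!} \frac{f(j)}{f(0) - f(j)}\, y_{k-j}(f^{(j)}).$$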
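The decomposition of $y_k(f)$ through shifted landscapes can be checked exactly. In the sketch below, `shift(f, j)` builds the landscape that keeps $f(0)$ and moves the other classes $j$ places to the left, and `decomposition(f, a, k)` evaluates the sum $\frac{a^k}{k!}\frac{f(0)}{f(0)-f(k)} + \sum_{j=1}^{k-1}\frac{a^j}{j!}\frac{f(j)}{f(0)-f(j)}\,y_{k-j}(f^{(j)})$. The helper names and the test landscape are ours.

```python
from fractions import Fraction
from math import factorial

def solve_recurrence(f, a, n):
    y = [Fraction(1)]
    for k in range(1, n):
        s = sum(y[j] * f[j] * a**(k - j) / factorial(k - j) for j in range(k))
        y.append(s / (f[0] - f[k]))
    return y

def shift(f, j):
    """Shifted landscape: class 0 keeps f(0), class i >= 1 receives f(i + j)."""
    return [f[0]] + f[j + 1:]

def decomposition(f, a, k):
    """y_k(f) written through the y_{k-j} of the shifted landscapes."""
    head = a**k / factorial(k) * f[0] / (f[0] - f[k])
    tail = sum(a**j / factorial(j) * f[j] / (f[0] - f[j])
               * solve_recurrence(shift(f, j), a, k - j + 1)[k - j]
               for j in range(1, k))
    return head + tail

f = [Fraction(v) for v in (7, 3, 5, 2, 4, 1, 6, 2)]   # f(0) is the strict maximum
a = Fraction(5, 4)
y = solve_recurrence(f, a, len(f))
```

The index $j$ plays the role of the first class visited after the master class, which is why the remaining path is governed by the shifted landscape.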
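Proof. Consider the identity of theorem 2.3,

$$y_k(f) = \frac{a^k}{k!} \frac{f(0)}{f(0) - f(k)} \Bigg( 1 + \sum_{\substack{1 \le h < k \\ 1 \le i_1 < \cdots < i_h < k}} \frac{k!}{i_1! (i_2 - i_1)! \cdots (k - i_h)!} \prod_{t=1}^{h} \frac{f(i_t)}{f(0) - f(i_t)} \Bigg),$$

and decompose the above sum according to the value of the first index:

$$\sum_{\substack{1 \le h < k \\ 1 \le i_1 < \cdots < i_h < k}} \frac{k!}{i_1! (i_2 - i_1)! \cdots (k - i_h)!} \prod_{t=1}^{h} \frac{f(i_t)}{f(0) - f(i_t)} = \sum_{i_1 = 1}^{k-1} \frac{f(i_1)}{f(0) - f(i_1)} \times \Bigg( \frac{k!}{i_1! (k - i_1)!} + \sum_{\substack{2 \le h < k \\ i_1 < i_2 < \cdots < i_h < k}} \frac{k!}{i_1! (i_2 - i_1)! \cdots (k - i_h)!} \prod_{t=2}^{h} \frac{f(i_t)}{f(0) - f(i_t)} \Bigg).$$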

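We make the following changes of variables:

$$j = i_1, \qquad h' = h - 1, \qquad t' = t - 1, \qquad i'_1 = i_2 - i_1, \ \dots, \ i'_{h'} = i_h - i_1.$$

Note that in particular we have $f(i_t) = f(i'_{t'} + j) = f^{(j)}(i'_{t'})$. The previous
expression becomes:

$$\sum_{j=1}^{k-1} \frac{k!}{j! (k-j)!} \frac{f(j)}{f(0) - f(j)} \times \Bigg( 1 + \sum_{\substack{1 \le h' < k-j \\ 1 \le i'_1 < \cdots < i'_{h'} < k-j}} \frac{(k-j)!}{i'_1! (i'_2 - i'_1)! \cdots (k - j - i'_{h'})!} \prod_{t'=1}^{h'} \frac{f^{(j)}(i'_{t'})}{f^{(j)}(0) - f^{(j)}(i'_{t'})} \Bigg).$$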

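Since $f(0) = f^{(j)}(0)$ and $f(k) = f^{(j)}(k - j)$ for all $j \in \{1,\dots,k-1\}$, taking
away the primes from the indexes, we see that

$$y_k(f) = \frac{a^k}{k!} \frac{f(0)}{f(0) - f(k)} + \sum_{j=1}^{k-1} \frac{a^j}{j!} \frac{f(j)}{f(0) - f(j)} \frac{a^{k-j}}{(k-j)!} \frac{f^{(j)}(0)}{f^{(j)}(0) - f^{(j)}(k-j)} \times \Bigg( 1 + \sum_{\substack{1 \le h < k-j \\ 1 \le i_1 < \cdots < i_h < k-j}} \frac{(k-j)!}{i_1! (i_2 - i_1)! \cdots (k - j - i_h)!} \prod_{t=1}^{h} \frac{f^{(j)}(i_t)}{f^{(j)}(0) - f^{(j)}(i_t)} \Bigg).$$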
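Yet, by theorem 2.3,

$$y_{k-j}(f^{(j)}) = \frac{a^{k-j}}{(k-j)!} \frac{f^{(j)}(0)}{f^{(j)}(0) - f^{(j)}(k-j)} \times \Bigg( 1 + \sum_{\substack{1 \le h < k-j \\ 1 \le i_1 < \cdots < i_h < k-j}} \frac{(k-j)!}{i_1! (i_2 - i_1)! \cdots (k - j - i_h)!} \prod_{t=1}^{h} \frac{f^{(j)}(i_t)}{f^{(j)}(0) - f^{(j)}(i_t)} \Bigg).$$

We replace in the previous formula and we conclude. □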


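Let $f : \mathbb{N} \to [0,\infty[$ be a fitness function which is eventually constant, i.e.,
there exist $N \ge 1$ and a positive constant $c$ such that

$$f(N) \neq c \quad \text{and} \quad \forall k > N \quad f(k) = c.$$

Let $(y_k(f))_{k \ge 0}$ be the solution to the recurrence relation $(\mathcal{R})$ for the fitness
function $f$. We want to show that, for all $k > N$,

$$y_k(f) = q_k + \sum_{j=1}^{N} \frac{a^j}{j!}\, q_{k-j} \Big( \frac{f(j)}{f(0) - f(j)} - \frac{c}{f(0) - c} \Big) \times \Bigg( 1 + \sum_{h=1}^{j-1} \sum_{0 = i_0 < \cdots < i_h < j} \prod_{t=1}^{h} \binom{j - i_{t-1}}{j - i_t} \frac{f(i_t)}{f(0) - f(i_t)} \Bigg),$$

where $(q_k)_{k \ge 0}$ is the solution to the recurrence relation $(\mathcal{R})$ for the sharp
peak fitness landscape $(f(0), c, c, \dots)$.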
Before proceeding to the proof of theorem 2.6, we introduce the following
notation in order to simplify the expression of the formula we want to prove.
For a fitness function $f$ and $l \ge 1$, we set

$$C_l(f) = 1 + \sum_{h=1}^{l-1} \sum_{0 = i_0 < i_1 < \cdots < i_h < l} \prod_{t=1}^{h} \binom{l - i_{t-1}}{l - i_t} \frac{f(i_t)}{f(0) - f(i_t)}.$$

Lemma 5.2. The coefficients $C_l(f)$, $l \ge 2$, satisfy the recurrence relation

$$C_l(f) = 1 + \sum_{j=1}^{l-1} \binom{l}{j} \frac{f(j)}{f(0) - f(j)}\, C_{l-j}(f^{(j)}).$$
Proof. Let $l \ge 2$. For $j \in \{1,\dots,l-1\}$,

$$C_{l-j}(f^{(j)}) = 1 + \sum_{h=1}^{l-j-1} \sum_{0 = l_0 < l_1 < \cdots < l_h < l-j} \prod_{t=1}^{h} \binom{l - j - l_{t-1}}{l - j - l_t} \frac{f(j + l_t)}{f(0) - f(j + l_t)}.$$

We replace $C_{l-j}(f^{(j)})$ by this expression in the formula of the lemma, and we change
the indexes in the following way:

$$h' = h + 1, \qquad j = i_1, \qquad j + l_1 = i_2, \ \dots, \ j + l_h = i_{h'}.$$

Exchanging the order of the sums gives the desired formula for $C_l(f)$. □

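Proof of theorem 2.6. We show the result by induction on $N$. Let us suppose
first that $N = 1$ and let $k \ge 2$. Then all the fitness functions $f^{(j)}$, $j \ge 1$,
are equal to the sharp peak landscape fitness function. Applying lemma 5.1
gives:

$$y_k(f) = \frac{a^k}{k!} \frac{f(0)}{f(0) - c} + a \frac{f(1)}{f(0) - f(1)}\, q_{k-1} + \sum_{j=2}^{k-1} \frac{a^j}{j!} \frac{c}{f(0) - c}\, q_{k-j}.$$

Yet, the sequence $(q_k)_{k \ge 0}$ satisfies the recurrence relation $(\mathcal{R})$ for the fitness
function $(f(0), c, c, \dots)$, i.e.,

$$q_k = \frac{a^k}{k!} \frac{f(0)}{f(0) - c} + \sum_{j=1}^{k-1} \frac{a^j}{j!} \frac{c}{f(0) - c}\, q_{k-j}.$$

It follows that

$$y_k(f) = q_k + a\, q_{k-1} \Big( \frac{f(1)}{f(0) - f(1)} - \frac{c}{f(0) - c} \Big).$$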

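The base case $N = 1$ is thus settled. Let now $N \ge 2$ and let us suppose that
the result of theorem 2.6 holds up to $N - 1$. Let $k > N$. On one hand, for all
$j \ge N$, the fitness function $f^{(j)}$ is equal to the sharp peak landscape fitness
function, therefore $y_{k-j}(f^{(j)}) = q_{k-j}$ for all $j \in \{N,\dots,k-1\}$. On the other
hand, $f(N+1) = \cdots = f(k) = c$. Thus, applying lemma 5.1 gives

$$\begin{aligned} y_k(f) &= \frac{a^k}{k!} \frac{f(0)}{f(0) - f(k)} + \sum_{j=1}^{k-1} \frac{a^j}{j!} \frac{f(j)}{f(0) - f(j)}\, y_{k-j}(f^{(j)}) \\ &= \frac{a^k}{k!} \frac{f(0)}{f(0) - c} + \frac{c}{f(0) - c} \sum_{j=1}^{k-1} \frac{a^j}{j!}\, q_{k-j} - \frac{c}{f(0) - c} \sum_{j=1}^{N} \frac{a^j}{j!}\, q_{k-j} + \sum_{j=1}^{N} \frac{a^j}{j!} \frac{f(j)}{f(0) - f(j)}\, y_{k-j}(f^{(j)}) \\ &= q_k - \frac{c}{f(0) - c} \sum_{j=1}^{N} \frac{a^j}{j!}\, q_{k-j} + \sum_{j=1}^{N} \frac{a^j}{j!} \frac{f(j)}{f(0) - f(j)}\, y_{k-j}(f^{(j)}). \end{aligned}$$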


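By the induction hypothesis, for all $j \in \{1,\dots,N-1\}$, we have

$$\begin{aligned} y_{k-j}(f^{(j)}) &= q_{k-j} + \sum_{l=1}^{N-j} \frac{a^l}{l!}\, q_{k-j-l} \Big( \frac{f^{(j)}(l)}{f^{(j)}(0) - f^{(j)}(l)} - \frac{c}{f^{(j)}(0) - c} \Big) C_l(f^{(j)}) \\ &= q_{k-j} + \sum_{l=1}^{N-j} \frac{a^l}{l!}\, q_{k-j-l} \Big( \frac{f(j+l)}{f(0) - f(j+l)} - \frac{c}{f(0) - c} \Big) C_l(f^{(j)}). \end{aligned}$$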

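We replace $y_{k-N}(f^{(N)}), \dots, y_{k-1}(f^{(1)})$ in the formula for $y_k(f)$ and we obtain

$$y_k(f) = q_k - \frac{c}{f(0) - c} \sum_{j=1}^{N} \frac{a^j}{j!}\, q_{k-j} + \sum_{j=1}^{N} \frac{a^j}{j!} \frac{f(j)}{f(0) - f(j)} \times \Bigg( q_{k-j} + \sum_{l=1}^{N-j} \frac{a^l}{l!}\, q_{k-j-l} \Big( \frac{f(j+l)}{f(0) - f(j+l)} - \frac{c}{f(0) - c} \Big) C_l(f^{(j)}) \Bigg).$$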

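Let us fix $i \in \{1,\dots,N\}$. The coefficient of $q_{k-i}$ in the development of $y_k(f)$
is then equal to:

$$\begin{aligned} &\frac{a^i}{i!} \Big( \frac{f(i)}{f(0) - f(i)} - \frac{c}{f(0) - c} \Big) + \sum_{\substack{1 \le j \le N \\ 1 \le l \le N-j \\ j + l = i}} \frac{a^j}{j!} \frac{a^l}{l!} \frac{f(j)}{f(0) - f(j)} \Big( \frac{f(j+l)}{f(0) - f(j+l)} - \frac{c}{f(0) - c} \Big) C_l(f^{(j)}) \\ &\qquad = \frac{a^i}{i!} \Big( \frac{f(i)}{f(0) - f(i)} - \frac{c}{f(0) - c} \Big) \Bigg( 1 + \sum_{j=1}^{i-1} \binom{i}{j} \frac{f(j)}{f(0) - f(j)}\, C_{i-j}(f^{(j)}) \Bigg). \end{aligned}$$

We conclude thanks to lemma 5.2. □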


6 Proof of the corollary 


We begin by giving two useful lemmas. For a fitness function $f : \mathbb{N} \to [0,\infty[$
we denote by $(y_k(f))_{k \ge 0}$ the solution to the recurrence relation $(\mathcal{R})$
corresponding to the function $f$.

Lemma 6.1. Let $f, g : \mathbb{N} \to [0,\infty[$ be two fitness functions satisfying both
$f(0) = g(0)$ and $f(k) \ge g(k)$ for all $k \ge 1$. Then, for all $k \ge 0$, we have
$y_k(f) \ge y_k(g)$.


Proof. The result follows from the inequality

$$\frac{1}{f(0) - f(k)} \sum_{j=0}^{k-1} y_j(f) f(j) \frac{a^{k-j}}{(k-j)!} \ge \frac{1}{g(0) - g(k)} \sum_{j=0}^{k-1} y_j(g)\, g(j) \frac{a^{k-j}}{(k-j)!}$$

along with an induction argument. □
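The comparison principle is easy to observe on an example: raising the fitness of any class other than 0 can only increase every $y_k$. The two landscapes below are arbitrary test data chosen so that $f(0) = g(0)$ and $f \ge g$ elsewhere.

```python
from fractions import Fraction
from math import factorial

def solve_recurrence(f, a, n):
    y = [Fraction(1)]
    for k in range(1, n):
        s = sum(y[j] * f[j] * a**(k - j) / factorial(k - j) for j in range(k))
        y.append(s / (f[0] - f[k]))
    return y

f = [Fraction(v) for v in (6, 4, 3, 5, 2, 2)]
g = [Fraction(v) for v in (6, 1, 3, 2, 0, 1)]   # g(0) = f(0) and g(k) <= f(k)
a = Fraction(3, 4)

yf = solve_recurrence(f, a, len(f))
yg = solve_recurrence(g, a, len(g))
dominated = all(u >= v for u, v in zip(yf, yg))
```

Both the numerators $y_j f(j)$ and the factor $1/(f(0) - f(k))$ are monotone in the fitness values, which is the content of the inequality used in the induction.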


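Let $N \ge 1$ and $\sigma > c > 0$. We define the fitness function $g_N : \mathbb{N} \to [0,\infty[$
by setting:

$$\forall k \ge 0 \qquad g_N(k) = \begin{cases} \sigma & \text{if } k = 0, \\ 0 & \text{if } 1 \le k \le N, \\ c & \text{if } N + 1 \le k. \end{cases}$$

Lemma 6.2. The series associated to $(y_k(g_N))_{k \ge 0}$ converges if and only if
$\sigma e^{-a} > c$.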

Proof. We know the result to be true for the sharp peak landscape, i.e. for
$N = 0$. By the comparison lemma 6.1, if $\sigma e^{-a} > c$ the series associated
to $(y_k(g_N))_{k \ge 0}$ converges. Suppose next that $\sigma e^{-a} \le c$. By lemma 2.2,
the convergence of the series associated to $(y_k(g_N))_{k \ge 0}$ is equivalent to the
existence of a quasispecies associated to $g_N$. We will thus show that such
a quasispecies cannot exist if $\sigma e^{-a} \le c$. Let us suppose that a quasispecies
$(x_k)_{k \ge 0}$ exists. The sequence $(x_k)_{k \ge 0}$ then satisfies:

$$x_0 > 0, \qquad \sum_{k \ge 0} x_k = 1,$$

$$\Phi = \sigma x_0 + c \sum_{k > N} x_k,$$

$$0 = x_0 \big( \sigma e^{-a} - \Phi \big),$$

$$0 = x_0\, \sigma \frac{a^k}{k!} e^{-a} - x_k \Phi, \qquad 1 \le k \le N.$$

In particular $\Phi = \sigma e^{-a}$ and $x_k = x_0 a^k / k!$ for $1 \le k \le N$. Thus,

$$\sigma e^{-a} = \Phi = \sigma x_0 + c \big( 1 - (x_0 + \cdots + x_N) \big) = \Big( \sigma - c \sum_{k=0}^{N} \frac{a^k}{k!} \Big) x_0 + c.$$

Let

$$t_N(a) = \sum_{k=0}^{N} \frac{a^k}{k!}.$$

We conclude that $x_0$ is given by

$$x_0 = \frac{\sigma e^{-a} - c}{\sigma - c\, t_N(a)}.$$

Denote by $a^*$ the only positive solution to the equation $c\, t_N(a) = \sigma$. The
expression obtained for $x_0$ is not positive for $a \in [\ln(\sigma/c), a^*[$, so a quasispecies
cannot exist for $a$ in this interval. If on the contrary $a > a^*$, we have

$$x_0 + \cdots + x_N = t_N(a)\, x_0 = \frac{\sigma e^{-a} t_N(a) - c\, t_N(a)}{\sigma - c\, t_N(a)}.$$

However, $t_N(a) < e^{a}$, which implies that this last expression is strictly larger
than 1. Thus, a quasispecies cannot exist if $a > a^*$ either. □


We now proceed to the proof of the corollary. By lemma 2.2, corollary 2.7
will be settled if we manage to show that the series associated to
$(y_k)_{k \ge 0}$ converges if $f(0) e^{-a} > \limsup_{n \to \infty} f(n)$, and diverges if $f(0) e^{-a} <
\liminf_{n \to \infty} f(n)$. Let us start by showing the former. Suppose first that the
function $f$ is constant equal to $c > 0$ from $N$ onwards. We can thus apply
theorem 2.6 and obtain

$$\forall k > N, \qquad y_k = q_k + \sum_{j=1}^{N} \frac{a^j}{j!}\, q_{k-j} \Big( \frac{f(j)}{f(0) - f(j)} - \frac{c}{f(0) - c} \Big) C_j(f).$$
It follows that

$$\sum_{k \ge 0} y_k = \sum_{k=0}^{N} y_k + \sum_{k > N} q_k + \sum_{j=1}^{N} \frac{a^j}{j!} \Big( \sum_{k > N} q_{k-j} \Big) \Big( \frac{f(j)}{f(0) - f(j)} - \frac{c}{f(0) - c} \Big) C_j(f).$$

Yet, the series associated to $(q_k)_{k \ge 0}$ is convergent for $f(0) e^{-a} > c$. If the
function $f : \mathbb{N} \to [0,\infty[$ is not eventually constant, we set

$$c^\infty = \limsup_{n \to \infty} f(n).$$


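Let $\varepsilon > 0$, pick $N \ge 0$ large enough so that for all $k > N$, $f(k) \le c^\infty + \varepsilon$.
We define the function $f^N : \mathbb{N} \to [0,\infty[$ by:

$$\forall k \ge 0 \qquad f^N(k) = \begin{cases} f(k) & \text{if } 0 \le k \le N, \\ c^\infty + \varepsilon & \text{if } k > N. \end{cases}$$

For $\varepsilon$ small enough, $f^N(0) e^{-a} > c^\infty + \varepsilon$. Since $f^N$ is constant equal to $c^\infty + \varepsilon$
from $N$ onwards, the series associated to $(y_k(f^N))_{k \ge 0}$ converges. By the
comparison lemma 6.1 the same is true for the series associated to $(y_k(f))_{k \ge 0}$.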
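We prove next that if $f(0) e^{-a} < \liminf_{n \to \infty} f(n)$, then the series associated
to $(y_k)_{k \ge 0}$ diverges. We set $c_\infty = \liminf_{n \to \infty} f(n)$ and, for $\varepsilon > 0$, we pick $N$
large enough so that $f(k) \ge c_\infty - \varepsilon$ for all $k > N$. We define the function
$f_N : \mathbb{N} \to [0,\infty[$ by:

$$\forall k \ge 0 \qquad f_N(k) = \begin{cases} f(0) & \text{if } k = 0, \\ 0 & \text{if } 1 \le k \le N, \\ c_\infty - \varepsilon & \text{if } k > N. \end{cases}$$

For $\varepsilon$ small enough, $f_N(0) e^{-a} < c_\infty - \varepsilon$. By lemma 6.2 the series associated
to $(y_k(f_N))_{k \ge 0}$ is divergent. By the comparison lemma 6.1 the same is true
for the series associated to $(y_k(f))_{k \ge 0}$.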


7 Conclusions 


We have given several explicit formulas for the stationary solutions of Eigen's
quasispecies model in the regime where the length of the genotypes goes to
infinity. Theorem 2.1 allows the inference of the fitness landscape from data
about the concentrations of the different genotypes, which makes it particularly
attractive for applications. The formulas in theorems 2.3 and 2.5 give
the concentrations of the different Hamming classes, relative to the master
sequence. For fitness landscapes which are eventually constant, a link is
made to the already known distribution of the quasispecies (for the sharp
peak landscape 0) in theorem 2.6. Finally, corollary 2.7 generalises the error
threshold criterion observed for the sharp peak landscape to fitness
landscapes depending on the Hamming class. The main interest of our results
lies in their exact nature; the only approximation they rely on is the long
chain regime, which even the simplest genomes in nature fall into. The main
limitation of our work is the assumption that the fitness of an individual
depends on its genome only through the number of point mutations from the
master sequence. Nevertheless, we believe that our results provide a first step
in finding quasispecies distributions for even more general fitness landscapes.